Abstract

Biological motor control provides highly effective solutions to difficult control
problems in spite of the complexity of the plant and the significant delays in sensory
feedback. Such delays are expected to lead to nontrivial stability issues and lack of
robustness of control solutions. However, such difficulties are not observed in biological
systems under normal operating conditions. Based on early suggestions in the
control literature, a possible solution to this conundrum has been the suggestion that
the motor system contains within itself a forward model of the plant (e.g., the arm),
which allows the system to 'simulate' and predict the effect of applying a control signal.
In this work we formally define the notion of a forward model for deterministic
control problems, and provide simple conditions that imply its existence for tasks
involving delayed feedback control. As opposed to previous work which dealt mostly
with linear plants and quadratic cost functions, our results apply to rather generic
control systems, showing that any controller (biological or otherwise) which solves
a set of tasks, must contain within itself a forward plant model. We suggest that
our results provide strong theoretical support for the necessity of forward models in
many delayed control problems, implying that they are not only useful, but rather,
mandatory, under general conditions.

1 Introduction

The motivation for this work arose from biological motor control, which is plagued by
inherent delays arising in sensory pathways, central processing units and motor outputs 
[HITO]. However, the results established shed light on any feedback control system, sub- 
ject to observation delays. Such delays, which in primates may reach 200-300 ms for 
visually guided arm movements, are very large compared to fast (150 ms) and intermedi- 
ate (500 ms) movements [HITO], and may lead to significant difficulties, as inappropriate 
control might cause instability or degraded performance. Delays have historically played 
a minor role in the field of robotics, as they can usually be made extremely small in 
such engineering applications. However, delayed state feedback has become increasingly 
important in engineering fields such as chemical control, distributed system control [16] 
and multisensory tracking [3]. In fact, one of the first attempts within the biological 
motor control literature [12] to address these issues was based on a well known concept
from control theory, namely the Smith predictor [14]. However, one should keep in mind 
that in attempting to understand biological control systems, based on control theoretic 
principles, one is in fact trying to 'reverse engineer' an unknown system, as opposed to 
the task facing an engineer, namely designing a control system (see [18] for a survey of 
the possible role of control theory in systems biology). 

Within an optimal control based approach, one needs to specify a class of admissible 
control laws, a set of plant constraints (e.g., musculo-skeletal), and a quantitative defini- 
tion of performance, typically formulated in terms of a cost function. An optimal control 
law is then derived by minimizing a cost function subject to the relevant constraints. 
However, within a biological context, the precise nature of the plant and the controller 
is seldom known precisely, and the cost function used by the system (if indeed one is 
used), may also be unknown. It would thus be useful to determine general conditions for 
the necessity of a forward model, which require as few assumptions as possible. While 
a solution to the delay problem in the form of a forward model is indeed plausible and 
intuitively appealing [H], the question arises as to whether it is mandatory, namely, is it 
possible to construct an optimal closed-loop control law which is not based on a forward 
model? As we show in this paper, the answer to this question is negative, under very mild 
and reasonable conditions. More specifically, we show that (under appropriate conditions) 
an optimal feedback control law based on delayed state observations, must incorporate 
within itself a forward model of the plant. 

As far as we are aware, there is currently no general theory which provides precise con- 
ditions for which forward models are indeed necessary. Early work, mainly concerned with 
the linear case (e.g., [7], [11], [17]), suggested several approaches to delayed control problems,
including the proposal that a predictive plant model is needed, as in [H]. For example, 
[11] showed that optimal control for linear systems based on minimizing a quadratic cost is
obtained by cascading a Kalman filter and a least-mean square state predictor. Later work 
extended these results in various directions. For example, [17] suggested an approach to
dealing with disturbance attenuation and [13], focusing on stability issues, extended these
results to more general linear systems, showing that state prediction is indeed a necessary 
component of such controllers. A survey of many aspects of this work, circa 2003, appears 
in [8]. We note that much of this work has dealt with the design of actual controllers 
(often for linear systems and quadratic cost). As mentioned above, our perspective in 
this work is somewhat removed from controller design, as we are concerned with a reverse 
engineering problem. More concretely, we begin with an observed control system, operat- 
ing effectively under conditions of delayed state observations, and demonstrate that any 
effective controller must contain a forward plant model. Since it is hard, in general, to 
make even qualitatively correct assumptions about the system (e.g., linearity of dynamics 
and quadratic cost), we attempt to provide the most general result possible. 

Before proceeding to a detailed description of our results, we note that the notion of 
an internal model has played an important role in control theory also in other contexts. 
Francis and Wonham [6] were the first to show that stable adaptation (a.k.a. regula- 
tion) requires the existence of an internal model. Adaptation refers to a situation where 
the output of the system maintains a constant asymptotic value whenever the system is 
subject to inputs from some class of signals. Intuitively, such an internal model enables
the system to 'subtract' external inputs, thereby eliminating their long term effect on the
system. Recently, a powerful extension of this theory was proposed in [15], where it was 
demonstrated to hold under very general conditions, without requiring the split into a 
'plant' and a 'controller' which was required in the original framework of [6]. Interest- 
ingly, this general theory has been applied to bacterial chemotaxis and shown to provide 
interesting novel insight in other biological situations as well. 

In summary, our main contribution in this work is the establishment of precise math- 
ematical conditions for generic deterministic delayed feedback control systems to possess 
an internal forward model (we comment on the extension to the stochastic setting in
Section 4, but leave the full elaboration of this direction to future work). The generality of
the results, and the nature of the conditions required for them to hold, set the stage for 
the development, and experimental verification, of a rigorous theory of delayed feedback 
control in biological systems. 

The remainder of the paper is organized as follows. Section 2 presents an overview
of the main results, outlining sufficient conditions for delayed feedback control systems
to possess a forward model. Specifically, in section 2.1 we outline the problem, followed
by a simple example in section 2.2 and a summary of the main results in section 2.3. In
section 2.4 we apply the general ideas to linear systems with time optimal control and
delayed state observations, while in section 2.5 we consider the problem of minimum jerk
control. Section 3 contains precise mathematical definitions and full proofs of the main
results, including a full analysis of two examples presented cursorily in section 2. Finally,
in section 4 we summarize our results and present some open research questions.



2 Results 

This section contains a relatively informal summary of our main results. Precise definitions,
assumptions, theorems and proofs appear in section 3. We begin by presenting the
problem formulation, followed by a description of conditions for which a forward model 
is mandatory. We will then use the general necessary conditions established to show that 
in linear time optimal control and minimum jerk optimal control, based on delayed state 
observations, a forward model is indeed required. 



2.1 Problem definition 

Figure 1 about here 

Consider a system to be controlled, referred to as a plant. A plant is usually described
by a state vector $x^p \in X \subset \mathbb{R}^n$. For example, in a 2-D motor control setting with joint
torques as control inputs, the plant is a 2-D manipulator. Its state consists of a pair of
joint angles and two velocities. Assuming that the joint angles take values in the range
$[0, \pi]$, while the velocities can assume any real value, we have $X = [0, \pi]^2 \times \mathbb{R}^2$. The plant
state dynamics are typically given by a differential equation of the form

$\dot{x}^p_t = A_p(x^p_t, u_t)$, (1)

where $x^p_t$ is the state of the plant at time $t$, $\dot{x}^p_t$ denotes the temporal derivative of $x^p_t$,
$u_t \in U$ is the control at time $t$, chosen from a set of possible controls $U$, and $A_p$ is a
function mapping the state and control to $\mathbb{R}^n$, namely $A_p : X \times U \to \mathbb{R}^n$. In the above
example of a 2-D manipulator, assuming that the torque is bounded in magnitude by 1,
we have $U = [-1, 1]^2$.

In this work we study controllers possessing a memory which, as we demonstrate, is 
essential in the case of optimal control with delayed observations. The memory of the 
controller at time t can be conceived of as the controller's state at time t. For example, it 
is well known [H] that when controlling a plant with delayed state observation of duration 
$D$, using the previous controls $\{u_s\}$ for $t - D < s < t$ can be useful in order to calculate
the current state of the plant. In this case the controller's memory can be described by a
function $x^c_t(\cdot)$, where $x^c_t(a) = u_{t-a}$ for all $0 < a < D$, namely the delayed control.

In order to rigorously investigate the notion of a forward model and derive conditions
for its existence, we quantify this notion mathematically in section 3.1; here we summarize
the main ideas. In the deterministic delayed state feedback case considered here, we define
a forward model by the ability of the controller to compute $x^p_t$, the exact state of the plant
at time $t$, given the delayed observation $x^p_{t-D}$ and its memory $x^c_t(\cdot)$. This ability to predict
the exact state of the plant is equivalent to the existence of a transformation $F$ such that

$x^p_t = F(x^p_{t-D}, x^c_t)$ (forward model). (2)

In order to clarify the definition, consider a situation when a controller does not possess
a forward model. This occurs when the relevant information available to the controller
at time $t$ does not suffice to determine the current plant state $x^p_t$ unambiguously. More
precisely, based on the current relevant information, $(x^p_{t-D}, x^c_t)$, the controller cannot
determine $x^p_t$. Note that the controller in our model has additional information beyond
$(x^p_{t-D}, x^c_t)$ (see Figure 1); as we claim later, this is irrelevant to the estimation of the
current state $x^p_t$.

The need for a forward model can be established for many scenarios such as regulation, 
tracking and optimal control, and the proof is similar for all. We therefore use a common 
notion of tasks to refer to all the above. An example of a task for a 2-D manipulator is 
reaching some point $x^{p*}$ on a plane within a prespecified period of time, or, alternatively,
in minimum time. Another possible task would be holding the manipulator still for 10 
seconds. Clearly, one can envisage any number of such tasks. The set of all tasks of inter- 
est will be denoted by X*. Tasks are fed to the controller sequentially, and it is assumed 
that each task can be performed for each initial state. Note that the system is assumed to 
be causal, thus the controller has access only to the current task that should be performed 
and not to future tasks. The system described, based on delayed state observations, is 
illustrated in figure 1. The solving set of control laws for task $x^*$, up to time $t$, is denoted
by $U^*_t(x^p, x^*)$ where $x^p$ is the initial state of the plant.

We will show in the sequel that the 'richness' of the set of tasks $X^*$, and the
corresponding control solutions $U^*_t(x^p, x^*)$, can make a difference as to whether a controller
solving the task must possess a forward model or not. For example, in section 2.2 we
introduce a plant and a controller solving a linear time optimal problem. In the first
case, where the set of target states is $X^* = [-1, 1]$, we show that a forward model is
indeed essential. However, in the case where the set of targets is limited to two values,
$X^* = \{-1, 1\}$, we give an example of a memoryless controller, which does not possess a
forward model, while still solving the optimal control problem perfectly (i.e., a forward
model is not needed in this case).

Next, we introduce a switching process, zt, which defines the times at which new tasks 
are specified. Each task is assumed to be fixed between two consecutive task initiations. 
A precise definition of the switching process can be found in section 3. A control law is
then defined by

$u_t = B_c(x^*_t, z_t, x^c_t, x^p_{\overline{t-D}})$, (3)

where $D$ is the observation delay, $x^*_t \in X^*$ is the task to be performed at time $t$, and $B_c$ is a
given function. We have introduced the notation $x^p_{\overline{t-D}} = x^p_{(t-D)^+}$, where $(x)^+ = \max(0, x)$,
in order to deal systematically with times $t < D$. In addition to the control signal itself,
we consider the dynamics of the controller's state (memory). One standard formulation
is in terms of a differential equation,

$\dot{x}^c_t(a) = A_c(x^*_t, z_t, x^c_t, x^p_{\overline{t-D}}, a)$
$x^c_0(a) = D_c(x^*_0, x^p_0, a)$, (4)

where $A_c$ and $D_c$ are given functions describing the dynamics and initial conditions,
respectively.

In the definition of a forward model, we stated that the relevant information available
to the controller regarding the current state $x^p_t$ is $(x^p_{t-D}, x^c_t)$. The controller has additional
information available at time $t$, consisting of $x^*_t$ and $z_t$. However, since a new task can
be specified at any time (independently of the value of $x^p_t$), the current state cannot
depend on these values.



2.2 Example - a simple linear time optimal control problem 

Figure 2 about here 

The abstract ideas introduced in the previous section are clarified through a simple example.
Consider a linear one dimensional time optimal control problem, where the objective
is to drive the plant (described by a single real-valued variable $x^p_t$) to a point $x^* \in X^* = X$
in minimum time. The plant dynamics are given by

$\dot{x}^p_t = -x^p_t + u_t$ ; $u_t \in [-1, 1]$. (5)

The minimum time cost function is given by

$J(x^p, x^*) = \int_0^{T_f} 1 \, dt = T_f$, (6)

where $T_f$ is the first time for which $x^p_{T_f} = x^*$, and the initial state of the plant is $x^p$. Thus,
the controller needs to minimize $J(x^p, x^*)$. The set of tasks here corresponds to reaching
any state $x^* \in X$ in minimum time. It is obvious that if $X = [-1, 1]$, all the tasks can be
performed, and the optimal solution in this case is simple and given by $u_t = \mathrm{sgn}(x^* - x^p_t)$.
This is an example of a so-called bang-bang control, where the control switches between
its extreme allowed values; see figure 2 for a graphical illustration.
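To make the bang-bang solution concrete, the following minimal sketch integrates the one dimensional plant $\dot{x} = -x + u$ under the control $u = \mathrm{sgn}(x^* - x)$ and compares the simulated hitting time with the closed-form value obtained from $x(t) = u + (x_0 - u)e^{-t}$. The function names, the Euler step `dt` and the horizon `t_max` are illustrative assumptions and are not part of the original formulation.

```python
import math


def sgn(v):
    """Sign function returning -1, 0 or +1."""
    return (v > 0) - (v < 0)


def analytic_hitting_time(x0, x_star):
    """Closed-form time to reach x_star from x0 with u = sgn(x_star - x0)."""
    u = sgn(x_star - x0)
    if u == 0:
        return 0.0
    if x_star == u:
        # The extreme points are only approached asymptotically.
        return math.inf
    # The solution of dx/dt = -x + u is x(t) = u + (x0 - u) * exp(-t).
    return math.log((u - x0) / (u - x_star))


def simulated_hitting_time(x0, x_star, dt=1e-4, t_max=20.0):
    """Euler simulation (an assumed discretization) of the bang-bang law."""
    direction = sgn(x_star - x0)
    x, t = x0, 0.0
    while t < t_max:
        # Stop as soon as the target has been reached or crossed.
        if direction == 0 or sgn(x_star - x) != direction:
            return t
        x += dt * (-x + direction)
        t += dt
    return math.inf
```

For instance, `analytic_hitting_time(-0.5, 0.5)` equals $\ln 3$, and the simulated value agrees with it up to the discretization error of the Euler scheme.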

Before proceeding to establish the existence of a forward model we summarize the gist 
of the argument. We start by assuming that a controller can solve a set of tasks X*, based 
on delayed state observation. We then argue by contradiction that if the controller lacks 
a forward model, then one can find a specific task $x^* \in X^*$ such that the controller will
not be able to perform the task correctly, in contradiction to the assumption. Notice that
the existence of such a task is a system related issue that has nothing to do with delays
or a specific "black box controller", as will be explained in section 2.3.

The argument for the necessity of a forward model in the present example proceeds as
follows (precise statements and proofs appear in sections 3.2 and 3.3). Assume that we
are provided with a black box controller, which performs the linear time optimal control
task optimally, based on delayed state observations. We will show that such a controller
must possess a forward model. Assume to the contrary that it does not, thus there exist
two distinct states, $s^1$ and $s^2$, $s^1 \neq s^2$, such that the controller cannot determine whether
the plant is currently in state $x^p_t = s^1$ or $x^p_t = s^2$. In other words, the controller's available
information relevant to the current state, namely $(x^p_{t-D}, x^c_t)$, does not suffice to determine
$x^p_t$. This implies that there exist two trials (namely, two initial states, times $t_1$ and $t_2$
and histories of tasks) such that the available information for both is identical, namely
$(x^p_{t_1-D}, x^c_{t_1}) = (x^p_{t_2-D}, x^c_{t_2})$, such that $x^p_{t_1} = s^1$ and $x^p_{t_2} = s^2$ where $s^1 \neq s^2$. However,
if we specify an identical new task at times $t_1$ and $t_2$, namely $(x^*_{t_1}, z_{t_1}) = (x^*_{t_2}, z_{t_2})$,
the controller will choose $u_{t_1} = u_{t_2}$ due to (3). On the other hand, consider the system
dynamics (5), and choose $x^*_{t_1} = x^*_{t_2} = (s^1 + s^2)/2$, assuming, without loss of generality,
that $s^1 < s^2$. Based on the exact solution $u_t = \mathrm{sgn}(x^* - x^p_t)$, the optimal controls are
$u_{t_1} = u^{1*} = 1$ and $u_{t_2} = u^{2*} = -1$. However, based on the assumption that the forward
model does not exist, we have shown that $u_{t_1} = u_{t_2}$, which contradicts the "correct task
performing" assumption. Thus, in this example, a forward model is indeed required.
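The core of this contradiction can be checked numerically. The sketch below, written under the assumption of the one dimensional plant (5) with optimal law $u = \mathrm{sgn}(x^* - x)$, picks two hypothetical states, builds the midpoint task, and verifies that no single control value is correct for both states. The helper names and the numeric values are illustrative only.

```python
def sgn(v):
    """Sign function returning -1, 0 or +1."""
    return (v > 0) - (v < 0)


def optimal_control(state, x_star):
    """Time optimal control for dx/dt = -x + u with u in [-1, 1]."""
    return sgn(x_star - state)


def separating_task(s1, s2):
    """Midpoint task used in the argument to tell two states apart."""
    return (s1 + s2) / 2.0


def shared_correct_controls(s1, s2, x_star, candidates=(-1, 1)):
    """Controls that would be optimal for both states at once."""
    return [u for u in candidates
            if u == optimal_control(s1, x_star)
            and u == optimal_control(s2, x_star)]


# Two hypothetical indistinguishable states with s1 < s2.
s1, s2 = -0.3, 0.4
x_star = separating_task(s1, s2)
u1 = optimal_control(s1, x_star)   # the lower state must move up
u2 = optimal_control(s2, x_star)   # the upper state must move down
```

A controller that maps identical information to an identical control has to output one value for both trials, and `shared_correct_controls(s1, s2, x_star)` is empty, which is the contradiction used in the text.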

In order to better understand the requirement for a forward model, we consider an
example where such a model is not needed. Consider the example discussed above, but
where the set of tasks (destination states) consists of only two points, $X^* = \{-1, 1\}$. In
this case a simple memoryless controller such as $u_t = x^*_t$ is optimal, and clearly lacks a
forward model. The reason for this is simple. When two states $s^1 \neq s^2$ are given, one
cannot find a task $x^*$ such that the controls from $s^1$ and $s^2$ will differ. The reason is
that if $x^* = -1$, the controller has to use $u = -1$ and if $x^* = 1$, it has to choose $u = 1$
independently of the initial state. Intuitively, the controller is not required to know the
exact state of the plant in order to be optimal (perform the task). This simple example
and intuition will form the basis of our general proof in section 3.2.

2.3 General results 

Having argued for the existence of a forward model in a simple linear example, we extend 
the results to a general setting. To do this, we need to specify when a controller works 
"well". Such a controller should perform all possible sequences of tasks correctly, which 
means that at each time $t_i$ where a new task is given, the control signal for the task
should belong to the set of controls $U^*_{t_{i+1}-t_i}(x^p_{t_i}, x^*_{t_i})$ performing the tasks correctly between
the times $t_i$ and $t_{i+1}$. We will refer to such a system as a Correct Task Performing System
(CTPS); a precise characterization is provided in definition 6. This definition, based on
the assumption that the task can always be solved, allows one to build a state feedback
controller easily. We show that under these circumstances, a "delayed state feedback
controller" can be built as well. We refer the reader to Theorem 7 for a precise statement of
the result.

The proof of Theorem 7 is based on building a controller that uses delayed observations,
by defining the memory of the controller to be $x^c_t(a) = u_{t-a}$. Then, given the
observation $x^p_{t-D}$, the current state can be reconstructed by solving the differential
equation for the plant with initial condition $x^p_{t-D}$, where the previous controls are taken
from the memory. Once the real state is available, we can choose the control from the
set $U^*$.
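The construction can be illustrated with a discrete-time sketch. It assumes the one dimensional plant $\dot{x} = -x + u$ discretized by an Euler step (an assumption made only for this illustration), a delay of `delay_steps` steps, and a controller whose memory is a buffer of the controls applied during the delay window. The controller replays these controls from the delayed observation to recover the current state and then applies the bang-bang law to the recovered state. All names are hypothetical.

```python
from collections import deque


def sgn(v):
    """Sign function returning -1, 0 or +1."""
    return (v > 0) - (v < 0)


def plant_step(x, u, dt):
    """One Euler step of the plant dx/dt = -x + u (assumed discretization)."""
    return x + dt * (-x + u)


def forward_model(x_delayed, past_controls, dt):
    """Replay the stored controls starting from the delayed observation."""
    x = x_delayed
    for u in past_controls:
        x = plant_step(x, u, dt)
    return x


def run(x0, x_star, delay_steps, n_steps, dt=1e-3):
    """Simulate delayed-observation control; return final state and max prediction error."""
    history = [x0]                        # history[k] is the true state at step k
    memory = deque(maxlen=delay_steps)    # controls applied during the delay window
    max_error = 0.0
    for k in range(n_steps):
        x_observed = history[max(k - delay_steps, 0)]   # state at time (t - D)^+
        x_predicted = forward_model(x_observed, memory, dt)
        max_error = max(max_error, abs(x_predicted - history[k]))
        u = sgn(x_star - x_predicted)     # control chosen from the predicted state
        memory.append(u)
        history.append(plant_step(history[k], u, dt))
    return history[-1], max_error
```

Calling `run(-0.5, 0.5, delay_steps=200, n_steps=3000)` returns a final state close to the target together with a prediction error that vanishes, since the replay repeats exactly the arithmetic performed by the simulated plant.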

As demonstrated in the simple example presented in section 2.2, a forward model may
not always be necessary. As shown in section 3.2, the necessity of a forward model can be
demonstrated in situations where the problem is sufficiently 'rich'. In the example above,
when the task set is binary, namely $X^* = \{-1, +1\}$, no forward model was required, while
if $X^* = [-1, +1]$ a forward model is indeed required. This idea of problem richness is
formalized in section 3.2. We will refer to a problem as sufficiently 'rich' by saying that
it does not contain Non Separable by Correct Task Performing (NSCTP) pairs of states;
see definition 8 for a precise characterization. Intuitively, we say that a pair of states is
NSCTP when for every task, the same correct control exists at time 0 for both states
(however, the control may differ for each task). The main contribution of this paper,
Theorem 9, establishes the existence of a forward model when NSCTP pairs of states do
not exist (i.e., the absence of NSCTP pairs of states is a sufficient condition for a forward
model to exist).

As a specific illustration of this idea, let us look back at the example in section 2.2. We
implicitly proved there that the system does not have NSCTP pairs of states by finding a
task $x^* = (s^1 + s^2)/2$, and showing that it leads to $u^{1*} = 1$ and $u^{2*} = -1$. The existence
of a forward model in this case (and in more general cases to be studied in the sequel)
follows from Theorem 9.
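The role of task richness can be explored with a small enumeration. The sketch below treats the one dimensional example on a finite grid and tests each pair of states for the NSCTP property by checking whether, for every task, the two states share a correct initial control. It assumes that correct controls are restricted to the extreme values $\pm 1$ and that any control is acceptable once the state already equals the target; both are simplifications made for this illustration, and the grids are hypothetical.

```python
from itertools import combinations


def correct_initial_controls(state, x_star):
    """Set of correct initial controls for reaching x_star in minimum time."""
    if x_star > state:
        return {1}
    if x_star < state:
        return {-1}
    # Task already completed: the remaining control is arbitrary (assumption).
    return {-1, 1}


def is_nsctp(s1, s2, tasks):
    """True when the two states share a correct initial control for every task."""
    return all(correct_initial_controls(s1, x) & correct_initial_controls(s2, x)
               for x in tasks)


def nsctp_pairs(states, tasks):
    """All pairs of distinct states that no task in the set can separate."""
    return [(a, b) for a, b in combinations(states, 2) if is_nsctp(a, b, tasks)]


states = [-1.0, -0.5, 0.0, 0.5, 1.0]
binary_tasks = [-1.0, 1.0]                      # X* = {-1, +1}
dense_tasks = [i / 4 for i in range(-4, 5)]     # grid standing in for X* = [-1, +1]
```

With the binary task set every pair of grid states is NSCTP, whereas with the denser task grid `nsctp_pairs(states, dense_tasks)` is empty, in line with the distinction drawn between the two task sets.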

2.4 Linear time optimal control 

We consider an optimal setpoint tracking problem within linear control theory. The ob- 
jective here is to reach, from an arbitrary initial position, a predefined setpoint x* in 
minimal time. In this case $X^* = X = \mathbb{R}^n$.

The cost function $J$, penalizing for time expended on the task, is

$J(x^p, x^*) = \int_0^{T_f} 1 \, dt = T_f$, (7)

where the initial state is $x^p$. The plant's linear dynamics are described by the ODE





m 



(8) 
where $M$ and the matrix multiplying the control are of dimensions $n \times n$ and $n \times m$ respectively. The results
can be generalized to more complicated sets of controls. We use Theorem 9 to provide
sufficient conditions for the existence of a forward model in this case. This is done by
showing that linear time optimal control with delayed state feedback has no NSCTP pairs
of states, thereby fulfilling the conditions required by the theorem. The precise statement
of this result is provided in Theorem 13.

The proof that the system has no NSCTP pairs of states is based on geometrical
properties of accessible sets, and can be found in section 3.3. Using Theorem 13, the need
for a forward model in the simple example presented in section 2.2 can be established
trivially, since the matrix $M$ and the matrix multiplying the control are given by $-1$ and $1$ respectively, which leads to a
normal system (a required assumption for Theorem 13), and the set $X = [-1, 1]$ satisfies
the other assumptions needed.

2.5 Minimum Jerk Optimal Control 

Many models for the control of human arm movements have been suggested in an attempt 
to explain experimental results. The minimum jerk model was probably the first approach 
to address these issues based on optimal control principles [5]. In this approach, a two 
degree of freedom manipulator endpoint is controlled on a plane by applying jerk (the 
third derivative of the position). The task that the system should perform is taking the 
plant from some initial state to a final state in time T, minimizing the total accumulated 
squared jerk. We show that such a problem, where T is a part of the task, possesses no 
NSCTP pairs of states, and therefore by Theorem 9, a CTPS controller based on delayed
inputs must contain a forward model. 

In this model the state consists of the end-point of the manipulator's displacement, 
velocity and acceleration in a plane, 

$x^p = (x, y, \dot{x}, \dot{y}, \ddot{x}, \ddot{y})^T = (x, y, u, v, z, w)^T$, (9)

with dynamics

$\dot{x}^p = (\dot{x}, \dot{y}, \ddot{x}, \ddot{y}, \delta, \gamma)^T$, (10)

where $\delta$ and $\gamma$ are the controls, namely $u = (\dddot{x}, \dddot{y})^T = (\delta, \gamma)^T$. We define a task termed
optimal setpoint tracking in constant time where the plant must be controlled so that
it reaches some state $x^{p*}$, with zero velocity and acceleration, while optimizing a cost
function $J$, when the initial state of the plant is $x^p$ and the time for reaching the goal
is $T$ (which is itself part of the task). Therefore the task is given by $x^* = (x^{p*}, T)$ and
$x^{p*} \in \bar{X}$, where

$\bar{X} = \{x \in X \mid u = v = z = w = 0\}$. (11)

The cost function is

$J(x^p, u, T) = \frac{1}{2} \int_0^T (\dddot{x}_t^2 + \dddot{y}_t^2) \, dt$, (12)

with initial conditions $x^p_0 = (x_0, y_0, x_{d0}, y_{d0}, x_{dd0}, y_{dd0})^T$ and boundary conditions $x^p_T =
(x_T, y_T, 0, 0, 0, 0)^T = x^{p*}$.
As was shown in [5], each coordinate, x and y, can be computed separately and
identically, and the solution for x has the following form

$x_t = a_0 + a_1 t + a_2 t^2 + a_3 t^3 + a_4 t^4 + a_5 t^5$, (13)

where the constants depend on T, on the initial conditions and on x^*. Theorem [TBI 
proves that for this system a forward model is indeed essential. The proof is based on 
theorem [9] after showing that the system has no NSCTP pairs of states. 
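As an illustration of how the constants of the quintic depend on $T$, on the initial conditions and on the target, the sketch below computes them for a single coordinate from the six boundary conditions (initial position, velocity and acceleration, and a final position reached with zero velocity and acceleration). The closed-form expressions are the standard solution of the resulting linear system; the function and argument names are hypothetical.

```python
def min_jerk_coefficients(x0, v0, acc0, x_target, T):
    """Coefficients (a0..a5) of the quintic reaching x_target at time T at rest."""
    d = x_target - x0
    a0 = x0
    a1 = v0
    a2 = acc0 / 2.0
    # Remaining coefficients follow from the three terminal conditions
    # x(T) = x_target, x'(T) = 0 and x''(T) = 0.
    a3 = (20.0 * d - 12.0 * v0 * T - 3.0 * acc0 * T ** 2) / (2.0 * T ** 3)
    a4 = (-30.0 * d + 16.0 * v0 * T + 3.0 * acc0 * T ** 2) / (2.0 * T ** 4)
    a5 = (12.0 * d - 6.0 * v0 * T - acc0 * T ** 2) / (2.0 * T ** 5)
    return (a0, a1, a2, a3, a4, a5)


def position(coeffs, t):
    """Evaluate the polynomial at time t."""
    return sum(a * t ** k for k, a in enumerate(coeffs))


def velocity(coeffs, t):
    """First derivative of the polynomial at time t."""
    return sum(k * a * t ** (k - 1) for k, a in enumerate(coeffs) if k >= 1)


def acceleration(coeffs, t):
    """Second derivative of the polynomial at time t."""
    return sum(k * (k - 1) * a * t ** (k - 2)
               for k, a in enumerate(coeffs) if k >= 2)
```

For a rest-to-rest movement from 0 to 1 with $T = 1$ this yields the familiar profile $10t^3 - 15t^4 + 6t^5$. Because the coefficients change with $T$, two tasks with the same target but different durations generally require different initial jerk from the same state.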

Note that when T is constant and is not a part of the control task, the system has an 
infinite number of NSCTP pairs, and a similar proof will not work because it relies on 
the absence of NSCTP pairs in the system. However, this does not imply that a forward 
model is not needed, but rather that higher order conditions may be required. 

3 Methods and Detailed Proofs 

In this section we rephrase, in a formal mathematical language, the ideas and results
introduced and presented intuitively in section 2. We begin with several technical definitions
which will be required in the sequel.

3.1 Basic definitions 

Let $X \subset \mathbb{R}^n$ be a set of states and $U \subset \mathbb{R}^m$ the set of possible actions that the controller
can choose from. We use an underline to denote the history of a dynamic variable between
time zero and time $t$, e.g., $\underline{u}_t : [0, t] \to U$, and similarly for arbitrary times $[t_1, t_2]$ we use
$\underline{u}_{[t_1, t_2]} : [t_1, t_2] \to U$. Denote by $U_t$ the set of possible piecewise continuous controls that
can be selected up to time $t$, namely $U_t = \{\underline{u}_t : \underline{u}_t \text{ is piecewise continuous on } [0, t]\}$. The
plant is given in (1).

We introduce a set of tasks to be solved, and a set of controls which solve these tasks. 

Definition 1. Let $X^*$ be a set of tasks that need to be solved by the controller, and let
$x^* \in X^*$ be a specific task. The set of task solving controls, $U^*_t(x^p, x^*)$, consists of all piecewise
continuous control laws, in the interval $[0, t]$, that lead to the performance of task $x^*$ when
the initial condition is $x^p$.

In the case where the task is completed at a time $\tau < t$, the remaining controls are arbitrary,
namely $U^*_{[\tau, t]} = U_{t-\tau}$. Since we consider situations where the controller executes a series
of tasks, we define the switching task process.

Definition 2. The switching tasks process $z_t$ is defined by $z_t = \sum_{i=0}^{\infty} \delta(t - t_i)$, where $t_i$
are the times at which the tasks are switched, and $\delta(\cdot)$ is the Dirac impulse function.

The controller is given by (3) and its state dynamics (memory) by (4). While other
definitions of memory may be considered, we limit ourselves in this letter to the present
formulation. We assume that the task definition process $x^*_t$ is constant between two task
switches. It will be convenient in the sequel to assume that the state space contains all
states reachable for any allowable control law.

Definition 3. The set $X \subset \mathbb{R}^n$ is inescapable when for all initial conditions $x^p_0 \in X$, and
controls $\underline{u}_t \in U_t$, the state at time $t$ remains in $X$, namely $x^p_t \in X$.
In principle, the task solving control laws are not necessarily continuous. We introduce
a subset of continuous control laws.

Definition 4. For any $\epsilon > 0$, the set $U^{*c}_\epsilon(x^p, x^*) = \{\underline{u}_\epsilon \in U^*_\epsilon(x^p, x^*) : \underline{u}_\epsilon \text{ is continuous}\}$,
consisting of all continuous task solving controls, is termed the continuous task solving
control set.

Next, we formally introduce the idea of a forward model. 

Definition 5. A controller possesses a forward model when there exists a transformation
$F$ such that for all times $t$, initial conditions $x^p_0$, switching sequences $\underline{z}_t$, and tasks $\underline{x}^*_t$, the
state is given by $x^p_t = F(x^p_{\overline{t-D}}, x^c_t)$, where $x^p_{\overline{t-D}} = x^p_{(t-D)^+}$.

In section 3.2 we provide precise conditions that imply the existence of a forward model.

3.2 General Results 

The present section is constructed as follows. Initially, a system (plant and controller)
with good performance is defined (definition 6). We then show that such systems can be
implemented even when the state observation is delayed (Theorem 7). Finally, whenever
the problem is not too trivial (see definition 8), we show that the controller must possess
a forward model (Theorem 9).

Several assumptions are required before proceeding to the main claims. We assume 
that all possible sequences of tasks in X* can be performed by a controller from any initial 
condition in X, and we also require that X cannot be escaped by applying legal controls. 

Assumption 3.1. For each task $x^* \in X^*$ and initial state $x^p \in X$, a piecewise continuous
solution exists, namely, for any value of $t$, $U^*_t(x^p, x^*) \neq \emptyset$.

In the sequel we will compare two control laws in a small interval around $t = 0$. In order
to do so, based on the values of the controls at $t = 0$, we need to assume the existence of
a small interval over which the task solving controls are continuous. In other words, for
each task $x^* \in X^*$ and state $x^p \in X$, there exists $\epsilon_0(x^p, x^*) > 0$ s.t. $U^{*c}_{\epsilon_0}(x^p, x^*) \neq \emptyset$. The
existence of such an interval follows directly from assumption 3.1.

Assumption 3.2. The set X is inescapable. 

Given a "black box controller" satisfying certain conditions, we will demonstrate the ex- 
istence of a forward model. 

Assumption 3.3. A task solving "black box controller", which provides a piecewise con- 
tinuous and continuous from the right control signal Ut, is given. 

Next, we define a "correctly performing system", namely a system which executes all 
possible sequences of tasks correctly. 

Definition 6. The controller, plant and task space constitute a Correct Task Performing
System (CTPS) when for each $\underline{x}^*_t$, $x^p_0$, $t$, $\underline{z}_t$,

$\underline{u}_{[t_i, \min(t_{i+1}, t)]} \in U^*_{\min(t_{i+1}, t) - t_i}(x^p_{t_i}, x^*_{t_i})$ for all $i \in \{j : t_j < t\}$.

In other words, the controller always selects a signal $u_t$ solving the sequence of tasks.
At this point we show that if there exists a controller without delay that renders the
system CTPS, then a controller with delay can render the system CTPS as well. The 
intuitive idea is that in a deterministic system, the state of the controller can store all 
past controls, and thereby simulate the plant in order to predict the current state. 

Theorem 7. Let $A_p$ be a deterministic plant as in (1) with state variable $x^p$. Then under
assumptions 3.1 and 3.2 there exists a controller of the form (3) such that the system is
CTPS.

Proof. Define $FM_D(x^p_0, \underline{u}_{[0,D]})$ to be the solution of the dynamics of the plant at time $D$,
when the initial state of the plant is $x^p_0$ and the control until time $D$ is $\underline{u}_{[0,D]}$. Now, define
the state of the controller at time $t$ to be

$x^c_t(a) = [t, u^*_a]^T \in \mathbb{R}^{m+1}$, (14)

where $u^*_a$ is the correct control at time $a$ for performing the task. Note that $u^*_a$ can be
defined even for $a > t$ assuming that $x^*_\tau = x^*_t$ for $t < \tau < a$ ($x^*$ does not change). In (14)
we separate the first component of the controller state (representing time) from the other
components, and use $x^c_{t,1}(a)$ for the former and $x^c_{t,2}(a)$ for the remaining $m$-dimensional
sub-vector consisting of $u^*$. We also define a projection of $x^c_t$ on an interval $[a, b]$ to be
$x^c_t[a, b]$. The state $x^c_t$ defined in (14) is obtained by the dynamics

1 

where we recall the definition (2) of the switching sequence $\{z_t\}$ in terms of an impulse
train. The future control is selected from the correct solution set of controls. More
formally, defining $\eta = \min[x^c_{t,1}, D]$, we set $\hat{x}^p_t = FM_\eta\left(x^p_{t-\eta}, x^c_{t,2}[x^c_{t,1} - \eta, x^c_{t,1}]\right)$ and
$u^*_a \in U(\hat{x}^p_t, x^*_t)$. In other words, $u^*_a$ is chosen from the correct solution set $U(\hat{x}^p_t, x^*_t)$,
where $\hat{x}^p_t = x^p_t$ is the exact prediction of the current state using the forward model $FM$.
For such a definition of the memory, the control can be chosen by

$$u_t = B^c(x^*_t, z_t, x^c_t, x^p_{t-D}) = x^c_{t,2}(x^c_{t,1}).$$

It is obvious that the control between task switches is chosen so that $u_{[t_i,t_{i+1}]}$ belongs to
the set $U_{t_{i+1}-t_i}(x^p_{t_i}, x^*_{t_i})$ for each $i$, and therefore the system is CTPS. □
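
The idea behind the proof, a controller memory that stores recent controls and replays them through a copy of the plant, can be illustrated numerically. The following is only a minimal sketch under stated assumptions, not the construction used in the paper: the scalar plant $\dot{x} = -x + u$, the Euler discretization, the bang-bang rule and all names (`plant_step`, `forward_model`, `run`, `DT`, `DELAY_STEPS`) are hypothetical choices made for illustration.

```python
from collections import deque

DT = 0.01          # integration step (assumed for illustration)
DELAY_STEPS = 20   # observation delay D, expressed in steps (assumed)

def plant_step(x, u, dt=DT):
    """One Euler step of the hypothetical plant dx/dt = -x + u."""
    return x + dt * (-x + u)

def forward_model(x_delayed, stored_controls):
    """Replay the stored controls from the delayed state to predict the current state."""
    x = x_delayed
    for u in stored_controls:
        x = plant_step(x, u)
    return x

def run(n_steps=500, target=0.5):
    """Simulate delayed-feedback control; return (max prediction error, final state)."""
    x = 0.0
    history = [x]                           # true plant states, index = step
    controls = deque(maxlen=DELAY_STEPS)    # controller memory of past controls
    max_err = 0.0
    for k in range(n_steps):
        # The controller only observes the state from min(k, D) steps ago.
        x_delayed = history[max(0, k - DELAY_STEPS)]
        x_hat = forward_model(x_delayed, controls)
        max_err = max(max_err, abs(x_hat - x))
        u = 1.0 if target > x_hat else -1.0  # control computed from the prediction
        controls.append(u)
        x = plant_step(x, u)
        history.append(x)
    return max_err, x
```

Because the plant is deterministic and the memory holds exactly the controls applied since the delayed observation, the prediction coincides with the true state, so the delayed controller behaves like an undelayed one. The window length `min(k, DELAY_STEPS)` plays the role of the shortened horizon used at the start of a trial.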

Next we introduce a property whereby two states may be "united" in terms of the
solution to tasks, and therefore cannot be distinguished. For such states, for each task,
there exist continuous controls whose values at time 0 are equal. The absence of
such pairs will enable us to guarantee the existence of a forward model in a controller.

Definition 8. For a problem where assumption 3.1 holds, a pair of distinct states $x^p$ and
$x'^p$, $x^p \neq x'^p$, is called a Non Separable by Correct Task Performing pair (NSCTP) if for
all $x^*$ and $0 < \epsilon < \min\{\epsilon_0(x^p, x^*), \epsilon_0(x'^p, x^*)\}$, there exist controls $u \in U^c_\epsilon(x^p, x^*)$ and
$u' \in U^c_\epsilon(x'^p, x^*)$ such that $u_0 = u'_0$.

The following theorem constitutes the main theoretical result in the paper. It provides 
sufficient conditions for the existence of a forward model in delayed state feedback control. 
A required condition is the absence of NSCTP pairs in the system. 






Theorem 9. Let $A_p$ be a deterministic plant as defined above, and assume that NSCTP pairs of
states are absent from the system. Let $(x^c, B^c)$ be a controller with delayed state feedback
which renders the system CTPS. Then, under assumptions 3.1, 3.2 and 3.3, there exists
a forward model $F$ such that for each $t$, any initial condition $x_0$, and history of tasks,

$$x^p_t = F(x^c_t, x^p_{t-D}).$$

Proof. Assume that the system is CTPS and assume by negation that such a forward
model $F$ does not exist. Therefore there exist times $t_1$ and $t_2$, controller states $x^c_{t_1}(\cdot) =
x^c_{t_2}(\cdot)$, and plant states $x^p_{t_1-D} = x^p_{t_2-D}$ such that for some two trials $(x_0, x^*_{t_1}, z^{t_1}, t_1)$ and
$(x'_0, x^*_{t_2}, z^{t_2}, t_2)$ we get $x^p_{t_1} \neq x^p_{t_2}$. But the controller chooses the action by
the rule $u_t = B^c(x^c_t, x^*_t, z_t, x^p_{t-D})$. Therefore, for each new task $x^*$ set at times $t_1$ and $t_2$ for
the two trials (since the system is inescapable and a solution always exists), we have $u_{t_1} = u_{t_2}$,
and since $u_t$ is continuous from the right and piecewise continuous, there exists $\epsilon_0 > 0$
such that $u_{t_i}$ are continuous on $[t_i, t_i + \epsilon_0]$ for $i \in \{1,2\}$. Thus, from the assumption that
the system is CTPS, it follows that for all $x^*$ and $0 < \epsilon < \min(\epsilon_0, \epsilon_1(x^p_{t_1}, x^*), \epsilon_2(x^p_{t_2}, x^*))$,
$u_{[t_1,t_1+\epsilon]} \in U^c_\epsilon(x^p_{t_1}, x^*)$ and $u_{[t_2,t_2+\epsilon]} \in U^c_\epsilon(x^p_{t_2}, x^*)$ (it suffices to look at the continuous
solutions since we know that the control signal is piecewise continuous for the "black box
controller"). But this means that the pair of distinct states $x^p_{t_1}$ and $x^p_{t_2}$ is NSCTP, which
contradicts the assumption that no such states exist. □



3.3 Linear time optimal control 

We consider two examples demonstrating the general claims established in section 3.2.
We begin in the present section by considering a linear control problem where the task is
defined as optimal setpoint tracking, introduced in section 2.4. The objective here is to
minimize the time required to reach the desired state with linear dynamics and delayed
observations. The formal task is described in definition 10. In order to simplify the
notation, we will omit the superscript $p$ from $x^p$ in this section. Some background results
required in this section, and alluded to below, are taken from [9].

Definition 10. Let $X^* = X$ and $x^* \in X^*$. The task is an optimal setpoint tracking task
when

$$U_t(x, x^*) = \arg\min_{u,\,t \,\mid\, x_t = x^*} J(x, u, t),$$

namely, the controller must take the plant state from the initial state $x$ to the desired
state $x^*$ while minimizing the cost function $J$.

The time optimal cost function and the dynamics are given in (7) and (8) respectively.
Let $u_{[0,t]}$ be a given control law. Then it is well known that

$$x_t = X_t x^0 + X_t \int_0^t X_s^{-1} N u_s \, ds, \qquad (15)$$

where the matrix $X_t$ is the solution of the system

$$\dot{X}_t = M X_t \quad \text{with } X_0 = I,$$

which can be written explicitly as $X_t = e^{Mt}$.
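
As a sanity check of (15), the formula can be compared with direct integration of the dynamics. The sketch below is illustrative only and rests on assumptions not made in the text: it uses a double integrator, $M = [[0,1],[0,0]]$ and $N = [0,1]^\top$, for which $e^{Mt} = I + Mt$ exactly, and the function names (`state_via_formula`, `state_via_euler`) are hypothetical.

```python
def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def X(t):
    """Fundamental matrix e^{Mt} for M = [[0, 1], [0, 0]]; M^2 = 0, so e^{Mt} = I + Mt."""
    return [[1.0, t], [0.0, 1.0]]

def X_inv(t):
    """Inverse of X(t), i.e. e^{-Mt}."""
    return [[1.0, -t], [0.0, 1.0]]

N = [0.0, 1.0]  # single input column (assumed example)

def state_via_formula(x0, u, t, n=1000):
    """Evaluate x_t = X_t x0 + X_t * int_0^t X_s^{-1} N u_s ds with the midpoint rule."""
    h = t / n
    integral = [0.0, 0.0]
    for k in range(n):
        s = (k + 0.5) * h
        g = mat_vec(X_inv(s), N)
        us = u(s)
        integral = [integral[i] + h * g[i] * us for i in range(2)]
    free = mat_vec(X(t), x0)
    forced = mat_vec(X(t), integral)
    return [free[i] + forced[i] for i in range(2)]

def state_via_euler(x0, u, t, n=20000):
    """Integrate dx/dt = Mx + Nu directly with explicit Euler steps, for comparison."""
    h = t / n
    x = list(x0)
    for k in range(n):
        us = u(k * h)
        x = [x[0] + h * x[1], x[1] + h * us]
    return x
```

For a bang-bang input that switches from $+1$ to $-1$ at $s = 1$ and a horizon $t = 2$, both routines should return a state close to $(1, 0)$ when started from the origin.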

The existence of a forward model in this case will be demonstrated under the following
assumptions that are needed to prove the absence of NSCTP pairs of states, and to fulfill
the assumptions of theorem 9.

Assumption 3.4. The system is essentially normal (as defined on p. 65 in [9]). The
term "essentially" implies that a property holds almost everywhere, except on a set with
measure zero. For simplicity, the term "essentially" will be omitted from now on in the
context of normal systems. The set X is a controllable and inescapable set (see Section 

The general definition of a normal system is somewhat intricate. However, for a time
independent linear system of the form (8), theorem 16.1 in [9] establishes that the system
is normal if and only if for each $j = 1, \ldots, m$ the vectors $N^j, MN^j, \ldots, M^{n-1}N^j$ are linearly
independent, where $N^j$ are the column vectors of the matrix $N$. The exact conditions on
the matrices $M$ and $N$ needed for the set $X$ to be controllable and inescapable require
further analysis. However, a condition such as stability of $M$ ensures the existence of a
set $X$ with $0 \in X$ that will be both controllable and inescapable.
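
The column-wise rank condition above is easy to test mechanically. The following is a small sketch, assuming matrices are given as lists of rows; the helper names (`mat_vec`, `rank`, `is_normal`) and the tolerance are hypothetical and chosen only for illustration.

```python
def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def rank(vectors, tol=1e-9):
    """Rank of a list of vectors via Gaussian elimination with partial pivoting."""
    rows = [list(v) for v in vectors]
    r = 0
    for c in range(len(rows[0])):
        pivot = max(range(r, len(rows)), key=lambda i: abs(rows[i][c]))
        if abs(rows[pivot][c]) < tol:
            continue  # no usable pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def is_normal(M, N):
    """Check that N^j, M N^j, ..., M^(n-1) N^j are independent for every column j of N."""
    n = len(M)
    m = len(N[0])
    for j in range(m):
        v = [N[i][j] for i in range(n)]
        krylov = []
        for _ in range(n):
            krylov.append(v)
            v = mat_vec(M, v)
        if rank(krylov) < n:
            return False
    return True
```

For instance, the double integrator with $M = [[0,1],[0,0]]$ and $N = [0,1]^\top$ passes the test, whereas $M = \mathrm{diag}(-1,-2)$ with the two input columns $(1,1)^\top$ and $(1,0)^\top$ fails it, because the second column together with its image under $M$ spans only a line. The latter pair is controllable in the usual sense, which illustrates that normality is the stronger requirement.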

As stated above, the main results in the present section rely heavily on basic concepts 
and theorems from [9]. For ease of reference, we recall some basic notions. 

Definition 11. Let $K(t, x^0)$ be the accessible set at time $t$, starting from $x^0$, namely

$$K(t, x^0) = \{x : \text{there exists } u \text{ which steers from } x^0 \text{ to } x \text{ at time } t\}.$$

The following two key observations about normal systems are taken from [9].

• For a normal system, $K(t, x^0)$ is strictly convex, bounded and closed.

• For normal systems, an optimal control law always exists, is unique and is essentially
determined by $u^*_t = \mathrm{sgn}(\eta^\top X_{\tau^*} X_t^{-1} N)$ for $x^0, x^* \in X$, where $\tau^*$ is the optimal time, $\eta$ is an
outward normal to $K(\tau^*, x^0)$ at $x^*$, and the trajectory $x_{[0,\tau^*]}$ is unique.

We begin by proving a basic lemma that establishes some properties that are required
in order to show that the system does not possess NSCTP pairs of states. The lemma
establishes geometric properties of two intersecting accessible sets. A sketch of the ideas
underlying the lemma is presented in Figure 3.

Lemma 12. Let $x^1, x^2 \in X$ and $x^1 \neq x^2$, and define $\tau^m = \sup\{\tau : K(\tau, x^1) \cap K(\tau, x^2) = \emptyset\}$.
Then under assumption 3.4:

1. $\tau^m < \infty$.

2. $K(\tau^m, x^1) \cap K(\tau^m, x^2) = \{x^*\}$.

3. There exists an outward normal $g$ to a supporting hyperplane to $K(\tau^m, x^1)$ at $x^*$,
and $-g$ is an outward normal to a supporting hyperplane to $K(\tau^m, x^2)$ at $x^*$.



Figure 3 about here 



Proof. For a normal system there exists an optimal control, namely for all $x^0, x^* \in X$
there exists a $\tau$ (might be infinity) such that $x^* \in \partial K(\tau, x^0)$ (by Theorems 14.1, 14.2,
15.1 and Corollary 15.1 in [9]). Define $L = K(\tau^m, x^1) \cap K(\tau^m, x^2)$.

Proof of 1: Assume by negation that $\tau^m = \infty$. We know also that $X = \lim_{\tau\to\infty} K(\tau, x^1) =
\lim_{\tau\to\infty} K(\tau, x^2)$ from assumption 3.4. From the definition of $\tau^m$, under the assumption that
$\tau^m = \infty$ there must exist $x^0 \in X^\circ$ such that for all $\tau < \infty$, $x^0 \notin K(\tau, x^1)$ (without loss of
generality). If $\tau^m = \infty$, then there is no optimal control from $x^1$ to $x^0$, which contradicts
the existence of a time optimal solution. Therefore $\tau^m < \infty$.

Proof of 2: First let us show that $L \neq \emptyset$. Assume by negation that $K(\tau^m, x^1) \cap K(\tau^m, x^2) =
\emptyset$. Since $K$ is closed, strictly convex and compact (from Lemma 12.1, Corollary 15.1 in
[9] and $\tau^m < \infty$), the sets $K(\tau^m, x^1)$, $K(\tau^m, x^2)$ are strictly separable by a hyperplane
$f(x) = a \cdot x + b$, i.e., there exists $\epsilon > 0$ such that for all $y \in K(\tau^m, x^1)$, $f(y) < -\epsilon$,
and for all $z \in K(\tau^m, x^2)$, $f(z) > \epsilon$, by Proposition 2.4.3 [l9]. Define $\tau^1 = \inf\{\tau :
K(\tau, x^1) \cap \{x : f(x) = -\epsilon\} \neq \emptyset\}$ and $\tau^2 = \inf\{\tau : K(\tau, x^2) \cap \{x : f(x) = \epsilon\} \neq \emptyset\}$.
Notice that $K(\tau^1, x^1) \cap \{x : f(x) = -\epsilon\}$ contains at most a single point since $K$ is
closed, strictly convex and an optimal control always exists. The same argument applies
to $K(\tau^2, x^2) \cap \{x : f(x) = +\epsilon\}$. Therefore $K(\tau^1, x^1) \cap K(\tau^2, x^2) = \emptyset$. It follows that also
for $\tau^0 = \min(\tau^1, \tau^2)$, we have that $K(\tau^0, x^1) \cap K(\tau^0, x^2) = \emptyset$. But $\tau^0 > \tau^m$, and this
contradicts the definition of $\tau^m$, therefore $L \neq \emptyset$.

Next we show that $L^\circ = \emptyset$. Assume by negation that it is not and let $x \in L^\circ$. Therefore
$x \in K^\circ(\tau^m, x^1)$ and $x \in K^\circ(\tau^m, x^2)$, but from the definition of $\tau^m$, for all $\epsilon > 0$,
$x \notin K(\tau^m - \epsilon, x^1)$ or $x \notin K(\tau^m - \epsilon, x^2)$ (without loss of generality assume that
$x \notin K(\tau^m - \epsilon, x^1)$), and $K$ is monotonic in $\tau$. Therefore for all $M > 0$, $x \in K^\circ(\tau^m + M, x^1)$,
so $x$ never lies on the boundary of an accessible set from $x^1$. Hence there is no optimal control
from $x^1$ to $x$, contradicting Theorem 15.1 and Corollary 15.1 in [9]. Therefore $L^\circ = \emptyset$.

Let us show that $L$ cannot include more than a single point. Since $L^\circ = \emptyset$ and $L$ is strictly
convex, if $x, x' \in L$ and $x \neq x'$ then a convex combination should be in $L$. But since
$L$ is strictly convex, the convex combination cannot be on $\partial L$, nor in the interior of $L$ since
it is empty. Therefore $L$ can contain only a single point.

Summarizing the above, $L$ is not empty and can contain only a single point, therefore
$L = \{x^*\}$.

Proof of 3: Define $K_1 = K(\tau^m, x^1)$ and $K_2 = K(\tau^m, x^2)$. Since $K_1^\circ \cap K_2^\circ = \emptyset$ and
$K_1, K_2$ are convex, we can use the separating theorem for $K_1^\circ, K_2^\circ$ (Proposition 2.4.2 p^).
Thus there exists a $g \in \mathbb{R}^n$, $g \neq 0$, such that for all $x \in K_1^\circ$, $x' \in K_2^\circ$, $g \cdot x \le g \cdot x'$.
Now let $x_n \in K_2^\circ$ such that $x_n \to x^*$, thus $g \cdot (x - x_n) \le 0$ and therefore $g \cdot (x - x^*) \le 0$.
Similarly, we find that $g \cdot (x' - x^*) \ge 0$. Since the functional $(g \cdot x)$ is continuous, the
same is correct for $y \in K_1$, $y' \in K_2$, i.e., $g \cdot (y - x^*) \le 0$ and $g \cdot (y' - x^*) \ge 0$. Thus $g$, $-g$
are outward normals to supporting hyperplanes of $K_1$, $K_2$ respectively. □

Using lemma 12 we will establish that the system does not possess NSCTP pairs of states,
and thus the need for a forward model will follow from theorem 9.

Theorem 13. Consider a linear normal system described by (8) and assume that a
controller with delayed input renders the system CTPS for an optimal setpoint tracking
task, where the cost function is given by (7). If $x^c$ is the memory state of the controller,
and assumptions 3.3 and 3.4 hold, then there exists a forward model $F$ such that for each
$t$, any initial condition $x_0$, and history of tasks $(x^*_t, z^t)$,

$$x_t = F(x^c_t, x_{t-D}).$$

Proof. First notice that assumption 3.3 holds since we required the system to be controllable,
and from Theorem 13.1 in [9], the minimizer exists. Thus the task can always be
performed, and from the normality of the system it follows that the time optimal control
reaching $x^*$ is bang-bang, which implies that the solution set is not empty. First we will show that the
system has no NSCTP pairs of states, and then use theorem 9 to establish the existence
of a forward model.

For a normal system, the time optimal control reaching $x^*$ is given by

$$u_t = \mathrm{sgn}(\eta^\top X_{\tau^*} X_t^{-1} N),$$

where $\eta$ is an outward normal to a supporting hyperplane to $K(\tau^*, x^0)$ at $x^*$ (except on
a set of measure 0). It is essentially unique (may differ over a set of times with measure
0) by Theorems 14.1, 14.2, 15.1 and Corollary 15.1 in [9]. Now, let $x^1$ and $x^2$ be two
distinct points in $X$. Then by lemma 12 there exists $x^*$ which is reachable from $x^1$ and $x^2$
in time $\tau^* = \tau^m$ (since $x^* \in \partial K(\tau^m, x^1)$ and $x^* \in \partial K(\tau^m, x^2)$) by time optimal control,
and there exist outward normals $\eta_1 = g$ and $\eta_2 = -g$. Since $X_t$ does not depend on the
initial conditions,

$$u^1_t = -u^2_t \neq 0,$$

$u_t$ is piecewise continuous and $u^1_0 \neq u^2_0$. Therefore for arbitrary $x^1$ and $x^2$ we have
found $x^*$ such that the solution is unique and $u^1_0 \neq u^2_0$. Thus the system does not possess
NSCTP pairs of states. We have shown that all the assumptions required for theorem 9
hold, and therefore there exists a forward model $F$ such that $x_t = F(x^c_t, x_{t-D})$. □



3.4 Minimum Jerk Optimal Control 

In this example, the plant's state, dynamics, control and cost functions are given in (9)-(12).
The initial and terminal conditions are given in section 2.5. The solution trajectory is
given in (13), where the constants are found using the initial and boundary conditions.
Taking three derivatives of (13) and setting $t = 0$, we obtain

$$\delta_0(x_f) = \frac{60}{T^3}x_f - \frac{60}{T^3}x_0 - \frac{36}{T^2}x_{d0} - \frac{9}{T}x_{dd0}.$$

First, notice that for a constant value of $T$, there exist NSCTP pairs. Each $x = (x_0, x_{d0}, x_{dd0})^\top$
and $x' = (x'_0, x'_{d0}, x'_{dd0})^\top$ such that

$$-\frac{60}{T^3}x_0 - \frac{36}{T^2}x_{d0} - \frac{9}{T}x_{dd0} = -\frac{60}{T^3}x'_0 - \frac{36}{T^2}x'_{d0} - \frac{9}{T}x'_{dd0}$$

are NSCTP pairs (there are infinitely many of these) since for each $x_f$ the optimal control
at time 0 satisfies

$$\delta_0(x_f) = \delta'_0(x_f).$$

This result does not imply that in the present case a forward model is not needed, but it
does imply that a higher order condition may be required in order to prove it.

Assume then that the terminal time $T$ can vary. For this case we will prove in theorem
16 that a forward model is essential. First we will show in Lemma 15 that the system
does not have NSCTP pairs of states, and then use theorem 9 to establish the claim.

Definition 14. Let $X^* = \tilde{X} \times \mathbb{R}$, where $\tilde{X} \subset X$, and $x^* = (x^{p*}, T) \in X^*$. The task is an
optimal setpoint tracking in constant time task when

$$U_T(x, x^*) = \arg\min_{u \,\mid\, x_T = x^{p*}} J(x, u, T),$$

namely, the controller must take the plant state from the initial state $x$ to the desired
state $x^{p*}$ in time $T$, while minimizing the cost function $J$.

In the present case the subset $\tilde{X}$ is given by (11).

Lemma 15. The system with state, dynamics, control and cost function given in (9)-(12), solving an
optimal setpoint tracking in constant time task, has no NSCTP pairs of states.



Proof. First, the control $\delta_t$ is continuous, therefore $U_T(x^p, x^*) \subset U^c_T(x^p, x^*)$. The
solutions are unique, therefore we just have to find a task $x^*$ where the controls at time
0 are different for the two initial states.

Let $x \neq x'$ be two initial states. Assume, without loss of generality, that the $x$
coordinate's initial conditions are different in the two initial states, i.e., $x = (x_0, x_{d0}, x_{dd0})^\top$
and $x' = (x'_0, x'_{d0}, x'_{dd0})^\top$ such that $x_0 \neq x'_0$. To show that the states are not a NSCTP
pair we have to find $T$ and $x_f$ such that $\delta_0(x_f) \neq \delta'_0(x_f)$, where $\delta^*(x_f)$ and $\delta'^*(x_f)$ are
the optimal controls to $x_f$ from the initial states $x$ and $x'$ respectively. Recall that the
optimal control at time 0 is given by

$$\delta_0(x_f) = \frac{60}{T^3}x_f - \frac{60}{T^3}x_0 - \frac{36}{T^2}x_{d0} - \frac{9}{T}x_{dd0}.$$

The necessary and sufficient condition for equality of the controls $\delta_0(x_f) = \delta'_0(x_f)$ is

$$9(x_{dd0} - x'_{dd0})T^2 + 36(x_{d0} - x'_{d0})T + 60(x_0 - x'_0) = 0.$$

Since this is a second order polynomial in $T$, there can be at most 2 roots $T_1$ and $T_2$. Let
$T \neq T_1, T_2$ and let $x_f$ be an arbitrary position, thus for $T, x_f$, $\delta_0 \neq \delta'_0$, which means that
the pair of states $x \neq x'$ is not a NSCTP pair. □
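
The argument of the proof can be checked numerically. The sketch below is illustrative and rests on an assumption: it uses the initial-jerk expression with coefficients 60, 36 and 9, which corresponds to a quintic trajectory that ends at rest (zero terminal velocity and acceleration). The function names (`initial_jerk`, `coincidence_times`) are hypothetical.

```python
import math

def initial_jerk(x0, v0, a0, xf, T):
    """Control (jerk) at t = 0 for reaching xf at rest in time T (assumed boundary conditions)."""
    return 60.0 * (xf - x0) / T**3 - 36.0 * v0 / T**2 - 9.0 * a0 / T

def coincidence_times(s, s_prime):
    """Positive real roots T of 9*da*T^2 + 36*dv*T + 60*dx = 0 for two initial states."""
    dx, dv, da = (p - q for p, q in zip(s, s_prime))
    a, b, c = 9.0 * da, 36.0 * dv, 60.0 * dx
    if a == 0.0:
        # Degenerate (linear) case: at most one root.
        roots = [] if b == 0.0 else [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            roots = []
        else:
            roots = [(-b + sign * math.sqrt(disc)) / (2.0 * a) for sign in (1.0, -1.0)]
    return sorted(T for T in roots if T > 0.0)
```

For two initial states with different positions, `coincidence_times` returns at most two durations at which the initial controls agree for every target $x_f$; at any other duration `initial_jerk` separates the two states, which is the property used in the lemma.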

At this point we are ready to prove the existence of a forward model. 

Theorem 16. A "black box controller" with delayed state feedback fulfilling assumption
3.3, which renders the system given in (9)-(12) CTPS for an optimal setpoint tracking
in constant time task, must possess a forward model. In other
words, there exists a forward model $F$ such that for each $t$, any initial condition $x_0$, and
history of tasks $(x^*_t, z^t)$,

$$x^p_t = F(x^c_t, x^p_{t-D}).$$

Proof. First, assumptions 3.1 and 3.2 hold trivially since the optimal trajectory is unique
and continuous, and there exists a polynomial solution for each $x^p \in X$ and $x^* \in X^*$.
From Lemma 15 we have that the system does not have NSCTP pairs, and therefore by
Theorem 9 there exists a forward model $F$ such that for each $x^p_0$, $x^c_0$, $t$,

$$x^p_t = F(x^c_t, x^p_{t-D}).$$

□

4 Discussion



We have studied the general problem of control based on delayed state observations. For 
this purpose we have formalized the notion of a system solving a set of control tasks, 
which is general enough to cover many of the standard control settings such as regulation 
and tracking. Under rather mild conditions on the system, we have shown that such a 
controller must contain within itself a forward model. This implies that the current plant 
state can be exactly determined based on the delayed state observation and the internal 
controller state. We applied our general framework to two widely studied problems, linear 
time optimal control and minimum jerk control, and provided explicit conditions for the 
necessity of a forward model. These results, and the general framework itself, provide 
powerful mathematical support for the existence of forward models in biological motor 
control, and, in fact, in any control system with delayed feedback. 

A possible limitation of our approach is its restriction to deterministic systems, as the 
notion of a forward model used here is clearly inapplicable in a stochastic setting. Since 
in a stochastic setting one cannot determine the state precisely, a reasonable requirement 
in this case is that the posterior state distribution, based on the observed delayed state 
and on previous controls, be determined from the present controller state. As was shown 
in [1], for additive cost functions the problem of control with delayed observations can 
be expressed as a Markov decision process without delay of a more complicated system. 
While we have obtained some results in this more challenging and realistic setting, the 
full elaboration of this issue is left for future work. A further open issue relates to ap- 
proximate, rather than exact, task performance. We expect that in this case some notion 
of approximate forward model will play a role (e.g., [2]). 

An interesting question relates to the necessity of the conditions we have provided, as 
we have only shown them to be sufficient. In fact, it is quite possible that milder conditions 
than the absence of NSCTP pairs suffice. Finally, it would clearly be of significant value 
to demonstrate the absence of NSCTP pairs, and thus the necessity of forward models, 
in more biologically relevant settings. However, proving this for nonlinear dynamical 
systems, with a level of complexity approaching that of biological systems, may require 
non-trivial analysis. We hope that simpler and mathematically more tractable conditions 
can be developed, whose existence will be easier to demonstrate.

Figure Captions



Figure 1 A delayed feedback control system, where the delayed plant state is observed
by a controller. The sequence $x^*_t$ represents a set of tasks, and the sequence $z_t$ denotes
the times at which tasks are switched.

Figure 2 A simple one-dimensional example where $X^* = X = [-1,1]$, $U = [-1,1]$, and
the objective is to drive the system to the point $x^*$. The exact control solution in this case
is $\mathrm{sgn}(x^* - x_t)$.

Figure 3 Two accessible sets meet: The sets $K(\tau^m, x^1)$ and $K(\tau^m, x^2)$ intersect at
time $\tau^m$ at the point $x^*$, shown with the outward normals to the supporting
hyperplane.